From Distributions to Labels: A Lexical Proficiency Analysis using Learner Corpora
Abstract
This paper presents work on linking word lists derived from learner corpora to target proficiency levels for lexical complexity analysis. The word lists give frequency distributions over the different proficiency levels. We present a mapping approach that takes these distributions and maps each word to a single proficiency level. We also investigate how the mapping from distribution to proficiency level can be evaluated. We show that the distributional profile of words from the essays, informed by the essays’ levels, consistently overlaps with our frequency-based method, in the sense that words assigned the same proficiency level by our mapping tend to cluster together in a semantic space. In the absence of a gold standard, this information is useful for seeing how often a word is associated with the same level in two different models. The semantic space also provides a similarity measure that can show which words are more central to a given level and which are more peripheral.
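The abstract does not state the mapping rule or the similarity measure, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes five hypothetical level labels, an illustrative "first level whose share of the word's occurrences reaches a threshold" rule, and cosine similarity to a level centroid as a stand-in for centrality in the semantic space. The names `LEVELS`, `map_to_level`, `cosine`, and `centrality` are all invented for illustration.

```python
from math import sqrt

# Hypothetical level labels; the source does not name the proficiency scale.
LEVELS = ["A1", "A2", "B1", "B2", "C1"]


def map_to_level(freqs, threshold=0.1):
    """Map a word's per-level frequency distribution to a single level.

    Assumed rule (not taken from the paper): normalise the frequencies and
    return the first level whose share of occurrences reaches `threshold`.
    """
    total = sum(freqs)
    if total == 0:
        return None  # the word was never observed at any level
    for level, freq in zip(LEVELS, freqs):
        if freq / total >= threshold:
            return level
    return None  # no level reached the threshold


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centrality(word_vec, level_vecs):
    """Similarity of a word vector to the centroid of the vectors of its level.

    Assumption: higher values mean the word is more central to the level,
    lower values mean it is more peripheral.
    """
    dim = len(word_vec)
    centroid = [sum(v[i] for v in level_vecs) / len(level_vecs) for i in range(dim)]
    return cosine(word_vec, centroid)
```

For example, `map_to_level([0, 1, 20, 30, 10])` skips the first two levels, whose shares fall below the threshold, and returns the third label. Any other rule (most frequent level, cumulative share, significance test) would slot into the same function signature.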
Similar resources
Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition
A Comparative Analysis of Lexical Bundles in Journalistic Writing in English and Persian: A Contrastive Linguistic Perspective
This paper investigates the use of ‘lexical bundles’ in two broad corpora of journalistic writing. The aim of this study is to compare the use of lexical bundles in the two domains, one consisting of newspaper articles written in English and published in England and the other comprising newspaper articles written in Persian from Iranian publications. For this purpose, the frequency...
What Is Lexical Proficiency? Some Answers From Computational Models of Speech Data
Lexical proficiency, as a cognitive construct, is poorly understood. However, lexical proficiency is an important element of language proficiency and fluency, especially for second language (L2) learners. For example, lexical errors are a common cause of L2 miscommunication (Ellis, 1995). Lexical proficiency is also an important attribute of L2 academic achievement (Daller, van Hout, & Treffe...
Comparing Lexical Bundles in Hard Science Lectures; A Case of Native and Non-Native University Lecturers
Researchers have stated that when non-native lecturers learn and apply a certain set of lexical bundles used by native lecturers, students are helped to improve their proficiency through incidental vocabulary input. The present study sheds light on the lexical bundles in hard science lectures used by native and non-native lecturers in international universities, with the main purpose of analyzing the structu...
Publication date: 2016